adaptive regret
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.92)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New Jersey (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
Online Non-convex Learning in Dynamic Environments
This paper considers the problem of online learning with non-convex loss functions in dynamic environments. Recently, Suggala and Netrapalli [2020] demonstrated that follow the perturbed leader (FTPL) can achieve optimal regret for non-convex losses, but their results are limited to static environments. In this research, we examine dynamic environments and choose \emph{dynamic regret} and \emph{adaptive regret} to measure the performance. First, we propose an algorithm named FTPL-D by restarting FTPL periodically and establish $O(T^\frac{2}{3}(V_T+1)^\frac{1}{3})$ dynamic regret with the prior knowledge of $V_T$, which is the variation of loss functions. In the case that $V_T$ is unknown, we run multiple FTPL-D with different restarting parameters as experts and use a meta-algorithm to track the best one on the fly. To address the challenge of non-convexity, we utilize randomized sampling in the process of tracking experts. Next, we present a novel algorithm called FTPL-A that dynamically maintains a group of FTPL experts and combines them with an advanced meta-algorithm to obtain $O(\sqrt{\tau\log{T}})$ adaptive regret for any interval of length $\tau$. Moreover, we demonstrate that FTPL-A also attains an $\tilde{O}(T^\frac{2}{3}(V_T+1)^\frac{1}{3})$ dynamic regret bound. Finally, we discuss the application to online constrained meta-learning and conduct experiments to verify the effectiveness of our methods.
Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions
To deal with changing environments, a new performance measure--adaptive regret, defined as the maximum static regret over any interval, was proposed in online learning. Under the setting of online convex optimization, several algorithms have been successfully developed to minimize the adaptive regret. However, existing algorithms lack universality in the sense that they can only handle one type of convex functions and need apriori knowledge of parameters. By contrast, there exist universal algorithms, such as MetaGrad, that attain optimal static regret for multiple types of convex functions simultaneously. Along this line of research, this paper presents the first universal algorithm for minimizing the adaptive regret of convex functions. Specifically, we borrow the idea of maintaining multiple learning rates in MetaGrad to handle the uncertainty of functions, and utilize the technique of sleeping experts to capture changing environments.
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > California (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.92)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- North America > United States > Texas > Brazos County > College Station (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)